A novel Bayesian Max-EWMA control chart for jointly monitoring the process mean and variance: an application to hard bake process

In this article, we introduce a novel Bayesian Max-EWMA control chart under various loss functions to concurrently monitor the mean and variance of a normally distributed process. The Bayesian Max-EWMA control chart exhibits strong overall performance in detecting shifts in both mean and dispersion across various magnitudes. To evaluate the performance of the proposed control chart, we employ Monte Carlo simulation to compute its run length characteristics. We conduct an extensive comparative analysis, contrasting the run length performance of the proposed chart with that of existing ones. Our findings highlight the heightened sensitivity of the Bayesian Max-EWMA control chart to shifts of diverse magnitudes. Finally, to illustrate the efficacy of the Bayesian Max-EWMA control chart under various loss functions, we present a practical case study involving the hard-bake process in semiconductor manufacturing. Our results underscore the superior performance of the Bayesian Max-EWMA control chart in detecting out-of-control signals.


Bayesian approach
Bayesian theory, a foundational concept in statistics and probability, provides a powerful framework for making inferences and drawing conclusions from data. Unlike traditional frequentist statistics, where parameters are considered fixed and unknown, Bayesian theory treats parameters as random quantities described by probability distributions, allowing us to incorporate prior knowledge and update our beliefs as new evidence emerges. Prior distributions can be broadly categorized into two groups: non-informative and informative. Non-informative priors, such as Jeffreys and uniform priors, are commonly used, while informative priors frequently rely on conjugate priors, a widely recognized family of distributions. This approach not only offers a flexible and intuitive way to analyze data but also quantifies uncertainty, making it a fundamental tool in various fields, including science, engineering, and machine learning.
Consider the study variable X of an in-control process with parameters θ (mean) and δ² (variance). We employ a normal prior with parameters θ₀ and δ₀², defined as follows:
$$p(\theta) = \frac{1}{\sqrt{2\pi\delta_0^2}} \exp\left\{-\frac{(\theta - \theta_0)^2}{2\delta_0^2}\right\}$$
The posterior (P) distribution is obtained by combining the likelihood function of the sample with the prior distribution; it is proportional to their product. The resulting P distribution of the unknown parameter θ given the observed data X can be expressed as follows:
$$p(\theta \mid x) = \frac{p(x \mid \theta)\, p(\theta)}{\int p(x \mid \theta)\, p(\theta)\, d\theta}$$
The posterior predictive (PP) distribution is employed to predict future observations, with the P distribution playing the role of the prior for new data Y. This allows predictions for upcoming observations while taking parameter uncertainty into account. Its mathematical form is given below:
$$p(y \mid x) = \int p(y \mid \theta)\, p(\theta \mid x)\, d\theta$$
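As an illustration of the update described above, the following minimal sketch computes the P and PP distributions for the textbook conjugate case, assuming the process variance δ² is known and the prior on θ is normal. The function names and the numerical inputs are illustrative assumptions and are not taken from the paper.

```python
def normal_posterior(xbar, n, delta_sq, theta0, delta0_sq):
    """Posterior mean and variance of theta for N(theta, delta_sq) data.

    Assumes delta_sq is known and theta has a N(theta0, delta0_sq) prior
    (the standard normal-normal conjugate result).
    """
    denom = delta_sq + n * delta0_sq
    post_mean = (n * xbar * delta0_sq + delta_sq * theta0) / denom
    post_var = (delta_sq * delta0_sq) / denom
    return post_mean, post_var


def posterior_predictive(post_mean, post_var, delta_sq):
    """Mean and variance of one future observation Y given the data.

    The predictive variance adds the sampling variance to the remaining
    uncertainty about theta.
    """
    return post_mean, delta_sq + post_var


# Illustrative inputs (assumed values): a sample of size 5 with mean 1.5,
# unit process variance, and a N(0, 1) prior.
post_mean, post_var = normal_posterior(
    xbar=1.5, n=5, delta_sq=1.0, theta0=0.0, delta0_sq=1.0
)
pred_mean, pred_var = posterior_predictive(post_mean, post_var, delta_sq=1.0)
print(post_mean, post_var, pred_var)
```

The posterior mean is a precision-weighted compromise between the prior mean and the sample mean, so it moves toward the sample mean as n grows.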

Squared error loss functions
In Bayesian estimation, the squared error loss function (LF), abbreviated SELF, is a crucial tool for assessing the accuracy of parameter estimates. It measures the discrepancy between the estimated and true values by squaring their difference, so larger estimation errors are penalized more severely than smaller ones. Bayesian estimation combines prior beliefs and observed data to infer unknown parameters. The Bayes estimator under SELF is the posterior mean, which minimizes the expected squared error under the posterior distribution. This estimator performs well particularly when the posterior is approximately Gaussian. In this study, we employed the LF recommended by Gauss¹⁹. The SELF, which involves the variable of interest X and the estimator θ̂ of the unknown population parameter θ, is expressed mathematically as follows:
$$L(\theta, \hat{\theta}) = (\theta - \hat{\theta})^2$$
and the Bayes estimator under SELF is:
$$\hat{\theta}_{SELF} = E_{\theta \mid x}(\theta)$$
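The claim that the posterior mean minimizes the expected squared error can be checked numerically. The sketch below uses a hypothetical four-point discrete posterior (the support points and weights are illustrative assumptions) and compares the posterior mean with a brute-force search over a grid of candidate estimates.

```python
def expected_squared_loss(estimate, thetas, weights):
    """Posterior expected squared error of a candidate estimate."""
    return sum(w * (t - estimate) ** 2 for t, w in zip(thetas, weights))


def posterior_mean(thetas, weights):
    """Bayes estimator under SELF for a discrete posterior."""
    return sum(w * t for t, w in zip(thetas, weights))


# Hypothetical discrete posterior: support points and their probabilities.
thetas = [0.0, 1.0, 2.0, 4.0]
weights = [0.1, 0.4, 0.3, 0.2]

bayes_self = posterior_mean(thetas, weights)

# Brute-force search over candidate estimates 0.00, 0.01, ..., 4.00.
grid = [i / 100 for i in range(401)]
best_on_grid = min(grid, key=lambda e: expected_squared_loss(e, thetas, weights))

print(bayes_self, best_on_grid)
```

The grid minimizer coincides with the posterior mean because the expected squared loss equals the posterior variance plus the squared distance of the estimate from the posterior mean.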

Linex loss functions
An asymmetric LF in Bayesian analysis assigns different penalties to overestimation and underestimation, unlike a symmetric LF, which treats errors of equal size equally. The weights reflect the relative costs of the two kinds of error, which can improve the precision and efficiency of Bayesian inference when those costs differ. Varian²⁰ proposed the Linex LF (LLF) to mitigate risks in Bayes estimation. The LLF is mathematically described as:
$$L(\theta, \hat{\theta}) = e^{c(\hat{\theta} - \theta)} - c(\hat{\theta} - \theta) - 1$$
Under LLF, the Bayesian estimator θ̂ is given by:
$$\hat{\theta}_{LLF} = -\frac{1}{c} \ln E_{\theta \mid x}\left(e^{-c\theta}\right)$$
In the construction of the chart statistics, H(n, ν) is the chi-square distribution function with ν degrees of freedom, and Φ⁻¹ denotes the inverse of the standard normal distribution function. The EWMA statistics P_t and Q_t for the process mean and variance, respectively, are computed recursively from the initial values P_0 and Q_0, with λ (a constant within the range [0, 1]) denoting the smoothing constant. P_t and Q_t are mutually independent because the mean and variance statistics from which they are computed are independent. For an in-control process, both P_t and Q_t follow normal distributions, each with a mean of zero and variances of $\delta^2_{P_t}$ and $\delta^2_{Q_t}$, respectively. The plotting statistic of the Bayesian Max-EWMA chart for joint monitoring, based on P_t(LF) and Q_t(LF), is defined as:
$$Z_t = \max\left(\left|P_{t(LF)}\right|, \left|Q_{t(LF)}\right|\right)$$
Because the Bayesian Max-EWMA statistic is non-negative, only the upper control limit (UCL) needs to be plotted for jointly monitoring the process mean and variance. If the plotting statistic Z_t falls within the UCL, the process is in control; if Z_t exceeds the UCL, the process is out of control.
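The following sketch illustrates the two ingredients discussed above. It is a minimal illustration under stated assumptions, not the paper's exact Bayesian construction: the Linex estimate uses the closed form that holds when the posterior of θ is normal, and the Max-EWMA statistic is written in its classical form with known in-control mean and standard deviation. All function and parameter names (`linex_estimate_normal`, `chi2_cdf`, `max_ewma`, `mean0`, `sd0`) are hypothetical.

```python
import math
from statistics import NormalDist


def linex_estimate_normal(post_mean, post_var, c):
    """Linex Bayes estimate assuming a N(post_mean, post_var) posterior.

    -(1/c) * ln E[exp(-c * theta)] simplifies to post_mean - c * post_var / 2.
    """
    return post_mean - c * post_var / 2.0


def chi2_cdf(x, nu):
    """Chi-square CDF via the power series of the lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, z = nu / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-16 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return min(1.0, total * math.exp(a * math.log(z) - z - math.lgamma(a)))


def max_ewma(samples, mean0, sd0, lam, p0=0.0, q0=0.0):
    """Classical Max-EWMA plotting statistics for a list of subgroups."""
    nd = NormalDist()
    p, q = p0, q0
    z_values = []
    for x in samples:
        n = len(x)
        xbar = sum(x) / n
        s2 = sum((xi - xbar) ** 2 for xi in x) / (n - 1)
        u = (xbar - mean0) / (sd0 / math.sqrt(n))        # standardized mean
        h = chi2_cdf((n - 1) * s2 / sd0 ** 2, n - 1)      # H(.; n - 1)
        h = min(max(h, 1e-12), 1.0 - 1e-12)               # keep inside (0, 1)
        v = nd.inv_cdf(h)                                 # inverse normal CDF
        p = lam * u + (1.0 - lam) * p                     # EWMA for the mean
        q = lam * v + (1.0 - lam) * q                     # EWMA for the variance
        z_values.append(max(abs(p), abs(q)))
    return z_values


theta_llf = linex_estimate_normal(post_mean=1.25, post_var=1.0 / 6.0, c=0.5)
z = max_ewma([[-1.0, 0.0, 1.0]], mean0=0.0, sd0=1.0, lam=0.10)
print(theta_llf, z)
```

A process would be flagged as out of control at the first sample whose statistic exceeds the UCL; the UCL itself is not computed here and would have to be calibrated, for example by simulation.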

Performance evaluation
The performance of the proposed control charts has been assessed using the Average Run Length (ARL) and the Standard Deviation of Run Length (SDRL) as the key metrics. The baseline values ARL₀ and SDRL₀ represent the performance under in-control conditions. ARL₁ and SDRL₁ denote the ARL and SDRL values, respectively, when the process has shifted, indicating an out-of-control situation. The evaluation covers a range of mean and variance shifts to characterize the performance of the proposed charts in different scenarios. We employed 50,000 replicates to calculate both the ARL and SDRL. The smoothing constants were set at λ = 0.10 and 0.25. We explored combinations of mean shift values, denoted as a = 0.00, 0.25, 0.50, 0.75, 1.00, 1.25, 1.50, 1.75, 2.00, 2.25, 2.50, 2.75, 3.00, and variance shift values, denoted as b = 0.25, 0.50, 0.75, 0.90, 1, 1.10, 1.25, 1.50, 2.00, 2.50, 3.00. These combinations were used to assess the performance of the Bayesian Max-EWMA control chart (CC) in simultaneously monitoring both process mean and variance. The following simulation steps were used to calculate the ARLs and SDRLs.
Step 1 Establishing the control limits
i. Start by setting up the initial control limits, determining the values for UCL and λ.
ii. Create a random sample of size n, representing the in-control process, from the normal distribution.
iii. Compute the statistic for the proposed control chart.
iv. Check whether the plotting statistic falls within the UCL. If it does, repeat steps (ii-iv).
Step 2 Assessing the out-of-control Average Run Length (ARL)
i. Create a random sample for the process with a shift.
ii. Compute the statistic of the proposed control chart.
iii. If the plotting statistic falls within the UCL, repeat steps (i-ii). Otherwise, record the number of generated points, representing a single out-of-control run length.
iv. Repeat the above process (i-iii) 50,000 times to determine the out-of-control ARL₁ and SDRL₁.
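The simulation steps above can be sketched as follows. This is a minimal illustration under explicit assumptions rather than the authors' implementation: it uses the classical (non-Bayesian) Max-EWMA statistic, an in-control N(0, 1) process, subgroup size n = 5, a fixed hypothetical UCL of 0.6, a shifted process taken to be N(a, b²), and far fewer replicates than the 50,000 used in the paper.

```python
import math
import random
from statistics import NormalDist, mean, pstdev

N = 5  # subgroup size, as in the paper's tables


def chi2_cdf_4df(x):
    """Exact chi-square CDF for 4 degrees of freedom (n - 1 with n = 5)."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-x / 2.0) * (1.0 + x / 2.0)


def run_length(rng, ucl, lam, a=0.0, b=1.0, max_steps=100_000):
    """Samples drawn until max(|P_t|, |Q_t|) first exceeds ucl.

    Assumption: in control the process is N(0, 1); after a shift the
    observations are N(a, b**2).
    """
    nd = NormalDist()
    p = q = 0.0
    for t in range(1, max_steps + 1):
        x = [rng.gauss(a, b) for _ in range(N)]
        xbar = sum(x) / N
        s2 = sum((xi - xbar) ** 2 for xi in x) / (N - 1)
        u = xbar * math.sqrt(N)                            # standardized mean
        h = min(max(chi2_cdf_4df((N - 1) * s2), 1e-12), 1.0 - 1e-12)
        v = nd.inv_cdf(h)                                  # transformed variance
        p = lam * u + (1.0 - lam) * p
        q = lam * v + (1.0 - lam) * q
        if max(abs(p), abs(q)) > ucl:
            return t
    return max_steps


def arl_sdrl(ucl, lam, a=0.0, b=1.0, reps=2000, seed=1):
    """Monte Carlo estimates of the ARL and SDRL for a given shift (a, b)."""
    rng = random.Random(seed)
    run_lengths = [run_length(rng, ucl, lam, a, b) for _ in range(reps)]
    return mean(run_lengths), pstdev(run_lengths)


# Hypothetical UCL; in practice it is tuned until ARL0 reaches its target.
arl0, sdrl0 = arl_sdrl(ucl=0.6, lam=0.10, reps=200)
arl1, sdrl1 = arl_sdrl(ucl=0.6, lam=0.10, a=1.0, reps=200)
print(arl0, sdrl0, arl1, sdrl1)
```

In Step 1 the UCL would be adjusted and `arl_sdrl` rerun with a = 0 and b = 1 until the in-control ARL matches the desired value; Step 2 then reuses the same UCL for each shift combination.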
Tables 1, 2, 3 and 4 present the outcomes of the Bayesian Max-EWMA CC. The analysis evaluates the influence of the different LFs under the P and PP distributions, all with informative priors. The findings show that the suggested Bayesian Max-EWMA CC for simultaneous monitoring of production processes is highly sensitive to out-of-control conditions. Tables 1 and 2 show that the Bayesian Max-EWMA CC, under the SELF for P and PP distributions, efficiently detects shifts in the process mean and variance jointly. With each increment in the magnitude of the mean shift, there is a corresponding reduction in the ARL. Similarly, each shift of the variance away from its in-control value leads to a decrease in the ARL. These observations indicate that the CC can promptly identify process shifts, making it a valuable tool for the early and comprehensive monitoring of production processes.
For example, consider the ARL outcomes of the suggested Bayesian Max-EWMA CC when applying the SELF with smoothing parameter λ = 0.10 and n = 5. We considered shifts in the process mean, i.e., a = 0.00, 0.25, 0.50, 0.75, 1.50, 3.00, with the process variance held at b = 1. The resulting ARL values for these shifts were 369.41, 28.27, 9.42, 5.59, 2.67, and 1.51, respectively. As the magnitude of the mean shift increases, the ARL values decrease sharply, which underscores the efficiency of the suggested Bayesian Max-EWMA CC in detecting shifts in the process mean. Similarly, when the process variance is varied, i.e., b = 0.25, 0.50, 0.75, 1, 1.50, 3.00, with process mean shift a = 0.00, the corresponding ARL values are 3.39, 6.28, 20.59, 370.34, 7.87, and 1.76. These results indicate that when the process variance departs from 1 in either direction, the ARL decreases, demonstrating the ability of the proposed CC to detect changes in process variance. Additionally, we observed from Table 2 that the performance of the suggested Bayesian Max-EWMA CC decreases as the smoothing constant increases. This suggests that a lower smoothing constant may be more effective in certain scenarios.
Likewise, Tables 3 and 4 display the ARL results of the Bayesian Max-EWMA CC using the LLF. For a smoothing constant of 0.25 and a sample size of 5, we considered shifts in the process mean, i.e., a = 0.00, 0.25, 0.50, 0.75, 1.50, 3.00, with the process variance held at b = 1, and obtained corresponding ARL values of 370.61, 44.74, 9.85, 4.96, 2.10, and 1.02. These results illustrate that as the process shift increases, the ARL values decrease rapidly, indicating the good performance of the suggested Max-EWMA CC in detecting shifts in both the process mean and variance. Furthermore, the efficiency of the proposed CC for jointly monitoring the process mean and variance depends on the sample size. In Table 5, we compare the suggested Bayesian Max-EWMA CC with the existing Bayesian EWMA CC for smoothing constants λ = 0.10, 0.15 and 0.25 and sample size n = 5. The ARL outcomes clearly show that the proposed Bayesian Max-EWMA CC identifies out-of-control signals more effectively than the existing Bayesian EWMA CC. Across Tables 1, 2, 3, 4 and 5, as the sample size increases, the ARL outcomes decrease, indicating the greater efficiency of the suggested CC in detecting deviations from the expected process parameters. The key findings of the study are as follows:
• The efficiency of the suggested Max-EWMA CC for simultaneously monitoring the process mean and variance, and for detecting minor to moderate shifts, is evident from the run length profiles presented in all four tables associated with the suggested CC.
• Based on the simulation results, the performance of the suggested Bayesian CC for joint monitoring improves as the smoothing constant decreases.
• Sample size is one of the most important factors considered in this study. As the sample size increases, the performance of the suggested Bayesian Max-EWMA CC improves substantially.

Real life data application
This section presents the practical implementation of the proposed Bayesian Max-EWMA CC. The data used for this demonstration are drawn from Montgomery²¹ and pertain to the hard-bake process in semiconductor manufacturing. The dataset consists of 45 samples, each containing 5 wafers, resulting in a total of 225 data points. The measurements, in microns, represent the flow width, and the time interval between samples is 1 h. The initial 30 samples, comprising 150 observations, are considered indicative of an in-control process and are labeled as the phase-I dataset. The remaining 15 samples, totaling 75 observations, are designated as representative of an out-of-control process and are referred to as the phase-II dataset. The complete dataset is available in Appendix A. Both charts are employed to monitor variations in the process mean, and the computed results are showcased in Table 6.
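The phase-I and phase-II partition described above can be sketched as follows. The subgroup values below are synthetic placeholders generated only so that the example runs; they are not the Montgomery flow-width data. Estimating the in-control mean and standard deviation from phase I is shown as one common convention and is an assumption here, as are the helper names `split_phases` and `phase1_estimates`.

```python
import math
import random


def split_phases(samples, n_phase1):
    """Split subgroups into phase I (first n_phase1) and phase II (the rest)."""
    return samples[:n_phase1], samples[n_phase1:]


def phase1_estimates(phase1):
    """Grand mean and pooled standard deviation from equal-size subgroups."""
    means = [sum(s) / len(s) for s in phase1]
    variances = [
        sum((x - m) ** 2 for x in s) / (len(s) - 1)
        for s, m in zip(phase1, means)
    ]
    grand_mean = sum(means) / len(means)
    pooled_sd = math.sqrt(sum(variances) / len(variances))
    return grand_mean, pooled_sd


# Synthetic placeholder data: 45 subgroups of 5 values (NOT the real dataset).
rng = random.Random(0)
samples = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(45)]

phase1, phase2 = split_phases(samples, n_phase1=30)
grand_mean, pooled_sd = phase1_estimates(phase1)
print(len(phase1), len(phase2), grand_mean, pooled_sd)
```

With 45 subgroups of 5 observations, this split yields the 150 phase-I and 75 phase-II observations stated in the text.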
Figure 1 shows the Bayesian EWMA CC under SELF, in which all the points are within the control limits. Figures 2 and 3 show the recommended Bayesian Max-EWMA CC, designed to jointly monitor both the process mean and dispersion, under the SELF. These charts signal that the process is out of control at the 39th and 43rd samples for the smoothing constant values 0.10 and 0.25, respectively. Similarly, Figs. 4 and 5 depict the performance of the proposed CC under the LLF. These figures show out-of-control signals at the 40th and 42nd samples in the same setting. This observation not only

Conclusion
This study introduces an innovative Bayesian Max-EWMA CC designed for concurrent monitoring of both process mean and variance. It utilizes informative prior distributions and incorporates two distinct LFs within the context of P distributions. The results, presented in Tables 1, 2, 3, and 4, evaluate the performance of the proposed CC using metrics such as ARL and SDRL. ARL plots (Figs. 1, 2, 3, 4) provide compelling evidence of the superior performance of the Bayesian CC. To further assess the CC under varying LFs, a practical example is applied to the semiconductor manufacturing hard-bake process. Notably, the proposed Bayesian Max-EWMA CC, for both P distributions, excels in detecting out-of-control signals. Importantly, the principles of this study can be extended to other memory-type CCs. Moreover, this approach is not confined to normal distributions; it can be tailored for data following binomial or Poisson distributions, although this requires adjustments to the likelihood function. Extending the technique to non-normal distributions and various CC types can yield a more comprehensive understanding of the underlying data. This, in turn, facilitates early detection of potential quality issues, enables swift corrective actions, and reduces the risk of costly errors and defects. In practical applications such as healthcare, this approach aids in promptly identifying anomalies in patient data, allowing for timely interventions. Within finance, it can reveal fraudulent activities and potential errors in financial transactions. In manufacturing, broadening

Figure 2. Under SELF, the Bayesian Max-EWMA control chart for joint monitoring with λ = 0.10.

Figure 3. Under SELF, the Bayesian Max-EWMA CC for joint monitoring with λ = 0.25.

Table 1. ARL and SDRL outcomes for the Bayesian Max-EWMA CC using the P distribution under SELF, with λ = 0.10.

Table 2. Run length results for the suggested Bayesian Max-EWMA CC using the P distribution under SELF, with λ = 0.25.

Table 3. The ARL and SDRL results of the proposed CC using the P distribution under LLF, with λ = 0.10.

Table 4. The run length profiles for the Bayesian Max-EWMA CC using the P distribution under LLF, with λ = 0.25.

Table 5. The ARL and SDRL results for the Bayesian EWMA and Bayesian Max-EWMA CCs using the P distribution under LLF, with λ = 0.10.